Fast Linearization of Tree Kernels over Large-Scale Data
Authors
Abstract
Convolution tree kernels have been successfully applied to many language processing tasks, achieving state-of-the-art accuracy. Unfortunately, the higher computational complexity of learning with kernels, compared to learning with explicit feature vectors, makes them less attractive for large-scale data. In this paper, we study recent approaches to this problem, ranging from feature hashing to reverse kernel engineering and approximate cutting-plane training with model compression. We derive a novel method that combines reverse kernel engineering with an efficient kernel learning method. The approach gives the advantage of using tree kernels to automatically generate rich structured feature spaces, while learning and testing take place in the linear space, where they are fast. We experimented with training sets of up to 4 million examples from Semantic Role Labeling. The results show that (i) the choice of correct structural features is essential and (ii) we can speed up training from weeks to less than 20 minutes.
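A minimal sketch of the linearization idea discussed above, not the paper's exact reverse-kernel-engineering procedure: represent each parse tree by an explicit bag of tree fragments, hash those fragments into a fixed-dimensional vector (the feature-hashing route mentioned in the abstract), and train a fast linear classifier on it. The Tree class, the use of one-level production-rule fragments, and the 2**18-dimensional hashing space are illustrative assumptions.

```python
# Illustrative sketch: explicit hashed tree-fragment features + a linear SVM,
# so training and testing run in linear feature space instead of kernel space.
from sklearn.feature_extraction import FeatureHasher
from sklearn.svm import LinearSVC


class Tree:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)


def fragments(tree):
    """Yield one-level production-rule fragments, e.g. 'VP -> V NP'."""
    stack = [tree]
    while stack:
        node = stack.pop()
        if node.children:
            yield node.label + " -> " + " ".join(c.label for c in node.children)
            stack.extend(node.children)


# Two toy trees standing in for parsed training sentences.
t_pos = Tree("S", [Tree("NP", [Tree("N")]),
                   Tree("VP", [Tree("V"), Tree("NP", [Tree("N")])])])
t_neg = Tree("S", [Tree("NP", [Tree("N")]), Tree("VP", [Tree("V")])])

hasher = FeatureHasher(n_features=2**18, input_type="string")
X = hasher.transform([list(fragments(t)) for t in (t_pos, t_neg)])
clf = LinearSVC().fit(X, [1, 0])          # fast linear training on explicit features
print(clf.predict(hasher.transform([list(fragments(t_pos))])))
```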
Similar references
Fast subtree kernels on graphs
In this article, we propose fast subtree kernels on graphs. On graphs with n nodes, m edges, and maximum degree d, these kernels comparing subtrees of height h can be computed in O(mh), whereas the classic subtree kernel by Ramon & Gärtner scales as O(n^2 4^d h). Key to this efficiency is the observation that the Weisfeiler-Lehman test of isomorphism from graph theory elegantly computes a subtree k...
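A minimal sketch of the Weisfeiler-Lehman relabeling that underlies the O(mh) computation: at each of h iterations every node label is replaced by a compressed label built from the node's own label and the sorted labels of its neighbours, and the resulting label histograms act as explicit subtree features. The adjacency-list encoding and toy labels below are illustrative assumptions, not the authors' implementation.

```python
# Illustrative Weisfeiler-Lehman relabeling: per-iteration label histograms
# serve as explicit subtree features, one pass over the edges per iteration.
from collections import Counter


def wl_label_counts(adj, labels, h):
    """adj: {node: [neighbours]}, labels: {node: str}. Returns per-iteration label histograms."""
    histograms = [Counter(labels.values())]
    compress = {}  # maps an expanded label signature to a short new label
    for _ in range(h):
        new_labels = {}
        for node, neigh in adj.items():
            signature = labels[node] + "|" + ",".join(sorted(labels[n] for n in neigh))
            new_labels[node] = compress.setdefault(signature, "c%d" % len(compress))
        labels = new_labels
        histograms.append(Counter(labels.values()))
    return histograms


# Toy labeled graph: a path A - B - A
adj = {0: [1], 1: [0, 2], 2: [1]}
labels = {0: "A", 1: "B", 2: "A"}
print(wl_label_counts(adj, labels, h=2))
```

Comparing two graphs then reduces to a dot product of their concatenated label histograms, which is why this subtree kernel can be computed, and linearized, in time linear in the number of edges.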
Efficient Linearization of Tree Kernel Functions
The combination of Support Vector Machines with very high dimensional kernels, such as string or tree kernels, suffers from two major drawbacks: first, the implicit representation of feature spaces does not allow us to understand which features actually triggered the generalization; second, the resulting computational burden may in some cases make it unfeasible to use large data sets for trainin...
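To make the computational-burden point concrete, here is a minimal sketch of the classic subset-tree (convolution) kernel recursion of Collins and Duffy, which this line of work linearizes: each kernel evaluation runs a dynamic program over all pairs of nodes of the two trees, so it is quadratic in tree size, whereas an explicit linearized representation extracts features once per tree. The nested-tuple tree encoding and the decay value are illustrative assumptions.

```python
# Illustrative subset-tree kernel recursion (Collins & Duffy style).
# A tree is a nested tuple: (label, child1, child2, ...). LAMBDA is the decay parameter.
LAMBDA = 0.4


def nodes(tree):
    yield tree
    for child in tree[1:]:
        yield from nodes(child)


def production(tree):
    return (tree[0],) + tuple(child[0] for child in tree[1:])


def delta(n1, n2):
    # Number of common fragments rooted at n1 and n2, decayed by LAMBDA.
    if production(n1) != production(n2):
        return 0.0
    if not n1[1:]:                      # matching leaves
        return LAMBDA
    score = LAMBDA
    for c1, c2 in zip(n1[1:], n2[1:]):
        score *= 1.0 + delta(c1, c2)
    return score


def tree_kernel(t1, t2):
    # Sum over all node pairs: quadratic in the number of nodes.
    return sum(delta(a, b) for a in nodes(t1) for b in nodes(t2))


t1 = ("S", ("NP", ("N",)), ("VP", ("V",), ("NP", ("N",))))
t2 = ("S", ("NP", ("N",)), ("VP", ("V",)))
print(tree_kernel(t1, t2))
```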
Tunable GMM Kernels
The recently proposed “generalized min-max” (GMM) kernel [9] can be efficiently linearized, with direct applications in large-scale statistical learning and fast near neighbor search. The linearized GMM kernel was extensively compared in [9] with the linearized radial basis function (RBF) kernel. On a large number of classification tasks, the tuning-free GMM kernel performs (surprisingly) well comp...
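A minimal sketch of the generalized min-max kernel itself, assuming the definition commonly used in this line of work (a paraphrase, not a quotation from [9]): each vector is split into its positive and negative parts, and the kernel is the ratio of coordinate-wise minimum sums to coordinate-wise maximum sums, a tuning-free similarity in [0, 1] that admits efficient linearization via sampling and hashing schemes.

```python
# Illustrative exact GMM kernel computation (the linearization itself would
# approximate this value with hashed features, not shown here).
import numpy as np


def gmm_kernel(x, y):
    # Rewrite x as (max(x, 0), max(-x, 0)) so all entries are non-negative.
    x_t = np.concatenate([np.maximum(x, 0), np.maximum(-x, 0)])
    y_t = np.concatenate([np.maximum(y, 0), np.maximum(-y, 0)])
    return np.minimum(x_t, y_t).sum() / np.maximum(x_t, y_t).sum()


x = np.array([0.5, -1.0, 2.0])
y = np.array([0.3, -0.5, 1.0])
print(gmm_kernel(x, y))   # tuning-free similarity in [0, 1]
```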
Tree Kernels for Semantic Role Labeling
The availability of large scale data sets of manually annotated predicate–argument structures has recently favored the use of machine learning approaches to the design of automated semantic role labeling (SRL) systems. The main research in this area relates to the design choices for feature representation and for effective decompositions of the task in different learning models. Regarding the f...
Large-Scale Learning with Structural Kernels for Class-Imbalanced Datasets
Much of the success in machine learning can be attributed to the ability of learning methods to adequately represent, extract, and exploit the inherent structure present in the data of interest. Kernel methods represent a rich family of techniques that capitalize on this principle. Domain-specific kernels are able to exploit rich structural information present in the input data to deliver state of ...